Comparison of HMM and DTW methods in automatic recognition of pathological phoneme pronunciation
نویسندگان
چکیده
In the paper recently proposed Human Factor Cepstral Coefficients (HFCC) are used to automatic recognition of pathological phoneme pronunciation in speech of impaired children and efficiency of this approach is compared to application of the standard Mel-Frequency Cepstral Coefficients (MFCC) as a feature vector. Both dynamic time warping (DTW), working on whole words or embedded phoneme patterns, and hidden Markov models (HMM) are used as classifiers in the presented research. Obtained results demonstrate superiority of combining HFCC features and modified phoneme-based DTW classifier.
منابع مشابه
Automatic recognition of pathological phoneme production.
OBJECTIVE Proper diagnosis and therapy of pathological pronunciation of phonemes play an important role in modern logopedics. To enhance the efficiency of diagnosis and therapy an automatic recognition of pathological phoneme pronunciation is addressed in this paper. The authors focus on the therapy of phoneme substitution disorders. PATIENTS AND METHODS Recognized speech samples come from sp...
متن کاملImproving Phoneme Sequence Recognition using Phoneme Duration Information in DNN-HSMM
Improving phoneme recognition has attracted the attention of many researchers due to its applications in various fields of speech processing. Recent research achievements show that using deep neural network (DNN) in speech recognition systems significantly improves the performance of these systems. There are two phases in DNN-based phoneme recognition systems including training and testing. Mos...
متن کاملPronunciation assessment via a comparison-based system
In this paper, we present preliminary results on applying a comparison-based framework to the task of pronunciation scoring. The comparison-based system works by aligning a student’s utterance with a teacher’s utterance via dynamic time warping (DTW). Features that describe the degree of mis-alignment are extracted from the aligned path and the distance matrix. We focus on a dataset in Levantin...
متن کاملAutomatic Scoring of Shadowing Speech Based on DNN Posteriors and Their DTW
Shadowing has become a well-known method to improve learners’ overall proficiency. Our previous studies realized automatic scoring of shadowing speech using HMM phoneme posteriors, called GOP (Goodness of Pronunciation) and learners’ TOEIC scores were predicted adequately. In this study, we enhance our studies from multiple angles: 1) a much larger amount of shadowing speech is collected, 2) ma...
متن کاملReaction Time in Phoneme Recognition: A Comparative Study among Iranian Upper-Intermediate vs. Advanced EFL Learners at Institute Level
The present study aimed to investigate of reaction time in terms of phoneme recognition: A comparative study among Iranian Upper-Intermediate vs. Advanced EFL Learners at Institute level. The main question this study tried to answer was whether there is no difference in reaction time in terms of phoneme recognition in Iranian learners at Institute level. To answer the question, 5Upper-Intermedi...
متن کامل